Abstract


The pose of a polyhedral object can be determined with range data obtained from a set
of simple light-stripe range sensors. However, localization results are highly dependent on
sensor placement. This paper presents a method for designing an optimal sensor placement
of three light-stripe sensors with which to determine the pose of an arbitrarily positioned
object. We evaluate a sensor placement on the basis of average performance measures over
the whole state space of object pose by a Monte Carlo method. An optimal sensor placement
is then selected by another Monte Carlo method which searches for a maximal score function
of the performance measures over the whole space of sensor placements.
1 Introduction


Recognizing  the  pose  of  a  three-dimensional  (3-D)  object  in  a  workspace  is  a  fundamental 
task  in  many  computer  vision  applications,  including  automated  assembly,  inspection,  and 
bin  picking.  Many  3-D  object  recognition  systems  which  use  dense  range  images  have  been 
developed [1]. The recognition processes of these systems are very slow, making such techniques
impractical for industrial applications. While a dense range image is appropriate for
describing a complex scene precisely, scenes in industrial applications can usually be simplified
by modifying the environment. This modification enables object recognition using only
simple sensors such as light-stripe range finders. Simple range finders are among the fastest
and least expensive ways to acquire accurate range data. Multiple range finders viewing an
object from different perspectives can usually provide enough constraints to determine the
pose of a polyhedral object [17].

The  performance  of  an  object  recognition  system  is  evaluated  with  respect  to  an  error 
rate  of  object  recognition,  recognition  speed  and  pose  determination  error  caused  by  sensing 
and  other  errors.  One  important  issue  for  a  system  with  multiple  sensors  is  that  system 
performance  is  sensitive  to  the  location  of  sensors  in  a  workspace,  that  is,  sensor  placement. 
There are two sensing strategies: on-line planning and off-line planning. On-line planning
selects the best sensing position sequentially and requires planning and execution time between
measurements. On the other hand, off-line planning is desirable for industrial vision
tasks because sensing positions are determined all at once before performing the tasks.

In this paper, we present an off-line method for selecting an optimal sensor placement
of three simple light-stripe range finders which are used to determine the pose of a polyhedral
object. Our method consists of three techniques: object recognition, pose uncertainty
estimation and sensor placement evaluation. A method for recognizing an object and estimating
the geometric uncertainty of the object’s pose was previously described in [16]. In
brief, the pose of an object was recognized by matching 3-D line segments obtained by the
range finders to model faces based on an interpretation tree search technique with geometric
constraints. Then, the geometric uncertainty of the object’s pose was estimated by using a
relationship between sensing error and pose error.

By combining these methods, we evaluate the goodness of a sensor placement. The state
space of the pose of a 3-D object has six degrees of freedom with a uniform probability
distribution. Given an object model and a sensor placement of three range finders, an
average error rate of object recognition, average recognition time and average position error
in pose determination over the state space are estimated by a Monte Carlo method. The
given sensor placement can be evaluated by such expected average performance measures.

It is not feasible to explore the entire configuration space which represents an arbitrary
sensor placement to find an optimal sensor placement. For simplicity, we assume that the
configuration of one range finder in the workspace is defined by three Euler angles which
represent the position and orientation of the light plane of the range finder. However,
there are still many degrees of freedom to specify the configuration of three range finders
simultaneously. Therefore, another Monte Carlo method is used to select an optimal sensor
placement from a configuration space which consists of a finite set of randomly generated
sensor placements. Note that the expected average performance of our object recognition
and pose determination method under an optimal sensor placement can be characterized
completely via the Monte Carlo simulation.

Related  Work 

The  related  work  on  object  recognition  and  pose  determination  with  sparse  range  data  was 
reviewed in [16]. In brief, Grimson and Lozano-Perez [9] demonstrated that local unary
and  binary  geometric  constraints  are  very  effective  in  reducing  the  size  of  an  interpretation 
tree  which  represents  correspondences  between  sensed  features  and  model  features.  A  least 
squares  method  is  usually  used  to  determine  the  pose  of  an  object  [7],  [13].  Uncertainty 
bounds  on  the  object  position  were  obtained  geometrically  [5],  and  algebraically  [10]. 

Research  on  automatic  placement  of  a  TV  camera  and  a  light  source  for  a  vision  task  has 
been  reported  [4], [20], [22], [23].  Assuming  that  the  position  of  an  object  is  known,  acceptable 
camera  positions  which  simultaneously  satisfy  the  requirements  for  resolution,  field  of  view, 
focus  and  visibility  were  determined  by  combining  the  geometric  relationships  between  the 
camera  positions  and  those  requirements  [4],  and  by  using  an  optimization  function  [22].  An 
approach  which  finds  optimal  sensor  and  light  source  positions  in  terms  of  edge  visibility  was 
discussed  in  [23].  A  system  which  automatically  generates  a  layout  plan  of  local  windows 
in  the  field  of  view  of  a  camera  for  visual  feedback  control  tasks  by  using  a  singular  value 
decomposition  technique  was  described  in  [20].  The  technique  was  also  used  to  determine 
light  source  positions  for  a  photometric  stereo  system.  However,  all  the  systems  described 
here can be applied only to an object whose pose is known approximately.

Work on planning sensing strategies has been reported [6], [10], [14], [18], [19], [21]. Most
of the research, however, has addressed the problem of selecting the next optimal sensing
position for object recognition and localization, that is, on-line sequential planning. During
initialization, some sensory measurements are necessary to reasonably reduce the number
of consistent interpretations of object pose. Then, selection of the next optimal sensing
position is achieved by evaluating which sensing position would minimize the ambiguity
of the feasible interpretations. The requirements of the initialization were not considered.
Compared with on-line sequential planning, off-line batch mode planning for sensing positions
is very advantageous. This is because moving a sensor on-line is unacceptable for many
industrial applications which require high speed and low cost system configuration. The
issue of finding a configuration of multiple sensors to minimize the pose uncertainty of a 2-D
object without initial measurements was addressed in [24]. However, a necessary condition
for the obtained optimal sensor placement is that it be independent of the object’s shape.
This is because a sensor placement is defined as the orientations of the sensors relative to
the observed object, and because the sensing error characteristics are sensitive only to the
orientations of the sensors. We focus on designing an optimal sensor placement off-line for
a given polyhedral object by evaluating sensor placements in terms of object recognition
performance and geometric uncertainty in pose determination.

Bayesian  decision  theory  can  be  used  to  determine  optimal  sensing  positions  for  object 
localization  on  the  basis  of  average  performance,  where  the  average  is  based  on  a  probability 
distribution  over  the  state  space  of  a  2-D  object  [2].  Decision-theoretic  principles  with 
geometric  models,  sensor  error  models  and  task  models  were  applied  to  the  problems  of 
optimal  sensing  strategies  and  sensor  data  fusion  in  [12].  Those  approaches  handled  only 
the  pose  uncertainty  of  a  2-D  object  with  known  assignments  between  sensed  features  and 
model  features. 

Goldberg [8] proposed a stochastic framework for manipulation planning where plans are
ranked on the basis of expected cost and demonstrated a stochastically optimal plan for
orienting planar parts with a programmable part feeder. He suggested that stochastic
planning can be used to treat the problem of finding an optimal sensor plan for recognizing
an object. Stochastic planning is closely related to Bayesian decision theory in that both require
a probabilistic model to evaluate average performance. However, the difficulty is that we
must explicitly describe the effect of a sensing operation with a probability distribution over
the state space of a 3-D object. Alternatively, we search for an optimal sensor placement
based on the expected average performance of object recognition and pose determination
by a Monte Carlo method assuming that the state space of a 3-D object has a uniform
probability distribution.

In this section, we introduced the research objective and reviewed related work. Sections
2 and 3 summarize our object recognition and pose uncertainty estimation techniques
respectively. In Section 4 we define some measures which reflect the system performance of
object recognition and pose determination under a sensor placement. Section 5 introduces a
method for ranking sensor placements on the basis of expected average performance of object
recognition and pose determination, and also designs an optimal sensor placement through
simulation. In Section 6, we briefly show experimental results with three light-stripe range
finders. The complete experiments on pose uncertainty under a designed optimal sensor
placement are presented in [16].


2 Fast Object Recognition with Light-Stripe Range Finders


We begin with an object recognition example. A simple light-stripe range finder projects
a light plane onto the faces of an object and measures 3-D line segments created by the
light stripe as shown in Figure 1. Three identical range finders are placed in the world
coordinate frame as shown in Figure 2. The range finders obtain 3-D line segments as shown
in Figure 3. Our matching scheme, which uses an interpretation tree search, assigns the sensed
line segments to the corresponding model faces and uses geometric constraints to eliminate
inconsistent segment-face pairings. The object’s pose is successfully determined as shown in
Figure 4. In this section, we briefly describe our object recognition and pose determination
technique. Further details are found in [16].

Figure 1: A simple light-stripe range finder.

2.1  Interpretation  Tree  Search  by  Geometric  Constraints 

The interpretation tree search technique with local unary and binary geometric constraints
finds a consistent set of pairings $(S_1, M_{p_1}), (S_2, M_{p_2}), \ldots, (S_k, M_{p_k})$ where $M_{p_i}$ is a model
face which corresponds to line segment $S_i$. The unary constraints check the consistency of
a pairing between a line segment and a model face and the binary constraints check the
consistency of two pairings.

Our unary and binary constraints for segment-face matching are weaker than those for
face-face and edge-edge matching in Grimson’s work [11] since line segments carry less information
than faces and edges. Therefore, after applying the unary and binary constraints, we
apply triplet constraints which check a triplet of pairings between line segments and model
faces to prune the interpretation tree more efficiently. We choose three line segments and
three model faces under the condition that two of the line segments must intersect each
other. Since the two line segments are therefore coplanar, two of the three model faces must
be the same. The intersecting line segments can be used to calculate the normal of the model
face on which the line segments lie. The normal of the other model face can be obtained by
solving a quadratic equation since the normal must be perpendicular to the direction vector
of the third line segment. Further details of the triplet constraints may be found in [16].
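The pruning idea can be illustrated with a generic depth-first search. The sketch below is not the matching program of [16]: the function name `interpretation_tree_search` and the callbacks `unary_ok` and `binary_ok` are hypothetical placeholders for the geometric checks, and the triplet constraints are omitted for brevity.

```python
from typing import Callable, List, Tuple

Pairing = Tuple[int, int]  # (segment index, model face index)


def interpretation_tree_search(
    num_segments: int,
    num_faces: int,
    unary_ok: Callable[[int, int], bool],
    binary_ok: Callable[[Pairing, Pairing], bool],
) -> List[List[Pairing]]:
    """Return every segment-to-face assignment that passes all checks."""
    results: List[List[Pairing]] = []

    def extend(partial: List[Pairing]) -> None:
        seg = len(partial)  # tree depth equals the next segment to assign
        if seg == num_segments:
            results.append(list(partial))
            return
        for face in range(num_faces):
            if not unary_ok(seg, face):
                continue  # prune: this segment cannot lie on this face
            candidate = (seg, face)
            # Prune unless the new pairing agrees with every earlier pairing.
            if all(binary_ok(prev, candidate) for prev in partial):
                partial.append(candidate)
                extend(partial)
                partial.pop()

    extend([])
    return results
```

Each level of the tree assigns one sensed segment, so a failed unary or binary check discards the whole subtree below that pairing without visiting it.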


Figure 2: Sensor placement for object recognition. Sensors 0 and 1 are placed on the z
axis, directed toward the origin. Their light planes, which are displayed as triangles, are
orthogonal. Sensor 2 is placed on the z axis and its light plane lies on the z-y plane.


Figure  3:  Obtained  3-D  line  segments  on  object  faces. 
Figure 4: An object recognition result. Estimated transformations $\omega$ (Rx), $\phi$ (Ry) and $\kappa$ (Rz)
are given in degrees and $t_x$, $t_y$ and $t_z$ are given in millimeters. Re is the standard deviation
of the distances between the endpoints of the line segments and the corresponding object
faces. Ti shows the elapsed time in seconds (Sun SPARCstation 2).

2.2  Computing  Transformations 

Next, we solve for the rotation matrix $R$ and the translation vector $t$ of the transformation
which maps points in the model coordinate frame into the world coordinate frame in such a
manner that each line segment lies on the corresponding model face. A point $p$ in the world
coordinate frame is related to a corresponding point $P$ in the model coordinate frame by

$$p = RP + t \tag{1}$$

Suppose that a line segment $S_i$, whose endpoints are $b_i$ and $c_i$, corresponds to a model face
$M_{p_i}$. If the point $p$ is on the line segment $S_i$, the squared distance from the point to the
corresponding model face is given by

$$(\Delta d_i)^2 = \left( N_{p_i}^T R^{-1} (p - t) + D_{p_i} \right)^2 \tag{2}$$

where $N_{p_i}$ and $D_{p_i}$ are the unit normal and offset of the model face $M_{p_i}$ respectively. The
rotation and translation components are therefore obtained by minimizing the sum of the
integral of the squared distance along each line segment over all pairings of an obtained
feasible interpretation $(S_i, M_{p_i})$ for $i = 1, \ldots, k$:

$$E = \sum_{i=1}^{k} \int_{S_i} (\Delta d_i)^2 \, ds_i \tag{3}$$

where $ds_i$ is an element of line segment $S_i$. An initial rotation component for minimization is
obtained by using a geometric relationship among three segment-face pairings which include
intersecting line segments. In the event that the three pairings do not include intersecting
line segments, a numerical polynomial-based technique [3] is used to obtain a rotation component.
Unfortunately, the polynomial-based method is very sensitive to noise and is also
computationally expensive since an eighth-degree equation must be solved. On the other
hand, the method which uses intersecting line segments is very fast and robust since a rotation
component is obtained by solving a quadratic equation in the triplet constraint check.
An initial translation component is computed by a least squares method.
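Because the distance from a point on a straight segment to a plane varies linearly along the segment, the integral of the squared distance has a closed form: for endpoint distances $d_b$ and $d_c$ and segment length $L$, it equals $L (d_b^2 + d_b d_c + d_c^2) / 3$. The following sketch evaluates the residual that way. The function names and the tuple layout of `pairings` are assumptions made for illustration and are not taken from [16].

```python
import numpy as np


def segment_residual(b, c, normal, offset, R, t):
    """Integral of the squared point-to-face distance along segment b-c.

    b, c are the segment endpoints in world coordinates; normal and offset
    describe the model face in model coordinates (normal . P + offset = 0).
    """
    # Map the endpoints into the model frame: P = R^-1 (p - t), with R^-1 = R^T.
    d_b = normal @ (R.T @ (b - t)) + offset
    d_c = normal @ (R.T @ (c - t)) + offset
    length = np.linalg.norm(c - b)
    # The distance is linear along the segment, so this closed form is exact.
    return length * (d_b * d_b + d_b * d_c + d_c * d_c) / 3.0


def total_residual(pairings, R, t):
    """Sum of segment residuals over (b, c, normal, offset) pairings."""
    return sum(segment_residual(b, c, n, d, R, t) for b, c, n, d in pairings)
```

A general-purpose minimizer could then be run on `total_residual` over the six pose parameters, starting from the initial rotation and translation described above.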

3  Geometric  Uncertainty  in  Pose  Determination 

Now  we  can  determine  the  pose  of  an  object.  However,  due  to  sensing  error  inherent  in 
measuring line segments, the obtained transformation contains some error which causes
uncertainty  in  the  position  estimate  of  the  object.  This  section  describes  our  technique  for 
estimating  the  pose  uncertainty. 

3.1  Estimating  Pose  Uncertainty 

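Let $x = (t_x, t_y, t_z, \omega, \phi, \kappa)^T$ be transformation variables, and let
$s = (x_1, y_1, z_1, \ldots, x_{2k}, y_{2k}, z_{2k})^T$ be a vector of endpoint pairs
$(x_{2i-1}, y_{2i-1}, z_{2i-1})$ and $(x_{2i}, y_{2i}, z_{2i})$ of line segments $S_i$ for $i = 1, \ldots, k$.
The pose of an object is determined by minimizing the residual $E$ of equation (3) with respect
to $x$. The necessary condition for $E$ to reach an extremum is given as

$$\frac{\partial E}{\partial t_x} = \frac{\partial E}{\partial t_y} = \frac{\partial E}{\partial t_z} = \frac{\partial E}{\partial \omega} = \frac{\partial E}{\partial \phi} = \frac{\partial E}{\partial \kappa} = 0 \tag{4}$$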

Now to examine the transformation error $\Delta x$ caused by the sensing error $\Delta s$, we linearize
these non-linear equations around the approximate solution $(x_0, s_0)$ which corresponds to
the correct transformation and endpoints,

$$A \Delta x \approx -B \Delta s \tag{5}$$

where $A$ is the Hessian matrix of $E$ with respect to $x$ and $B$ is the Jacobian matrix of
$\partial E / \partial x$ with respect to $s$.

Furthermore, a relationship between the transformation error $\Delta x$ and the position error
$\Delta v_j$ of a vertex $v_j$ is given by

$$\Delta v_j = D_j \Delta x \tag{6}$$

where $D_j$ is the Jacobian matrix of $v_j$ with respect to $x$. By substituting equation (5) into
equation (6), the covariance matrix $C_{v_j}$ of the vertex $v_j$ is given by

$$C_{v_j} = \mathrm{E}(\Delta v_j \Delta v_j^T) = D_j (A^{-1} B) \, C_s \, (A^{-1} B)^T D_j^T \tag{7}$$
Figure 5: An uncertainty estimation result after recognizing the object. Three bars on each
vertex show the uncertainty in pose determination. Er (mm) is the average position error of
all vertices.


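where $C_s$ is the covariance matrix of the line segments’ endpoint positions. The elements
of the covariance matrix $C_{v_j}$ describe the uncertainty in vertex position, and hence the x, y
and z components of the position error of each vertex can be approximated from its diagonal as

$$\Delta v_{jx} \approx \sqrt{(C_{v_j})_{11}}, \quad \Delta v_{jy} \approx \sqrt{(C_{v_j})_{22}}, \quad \Delta v_{jz} \approx \sqrt{(C_{v_j})_{33}} \tag{8}$$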


3.2  Example 

The  following  is  an  example  of  estimating  geometric  uncertainty  in  pose  determination. 
Given  the  shape  of  an  object,  the  object’s  pose  in  world  coordinates,  and  a  sensor  placement 
of  three  light-stripe  range  finders,  a  range  finder  simulator  calculates  line  segments  which 
would appear on the object. We assume that all endpoints of obtained line segments have the
same error (zero mean Gaussian white noise with standard deviation of 1 mm). We further
assume that any two endpoints are independently measured and that their respective errors
are not related (though the mechanism of the sensing error of a range finder is complex
in practice [15]). Thus, the covariance matrix $C_s$ of the line segments’ endpoint positions
becomes the identity matrix. We can estimate the uncertainty of each vertex of the object
with  equation  (7). 
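The propagation in equation (7) is a few lines of linear algebra once the matrices are available. The sketch below uses random stand-in matrices of the right shapes (6 pose variables, and 6k endpoint coordinates for k = 3 segments) purely to show the computation; the function names are assumptions, and taking square roots of the diagonal as per-axis errors is one plausible reading of how the bars in Figure 5 are derived.

```python
import numpy as np


def vertex_covariance(A, B, D_j, C_s):
    """Propagate endpoint covariance C_s to a vertex, following equation (7)."""
    G = np.linalg.solve(A, B)  # A^{-1} B without forming an explicit inverse
    M = D_j @ G                # sensitivity of the vertex position to endpoint error
    return M @ C_s @ M.T


def axis_std_devs(C_v):
    """Per-axis standard deviations taken from the diagonal (an assumed summary)."""
    return np.sqrt(np.diag(C_v))


# Toy stand-ins only: these are NOT matrices derived from a real object model.
rng = np.random.default_rng(0)
J = rng.normal(size=(18, 6))
A = J.T @ J                    # symmetric positive definite stand-in for the Hessian
B = rng.normal(size=(6, 18))   # stand-in for the Jacobian of dE/dx with respect to s
D_j = rng.normal(size=(3, 6))  # stand-in for the vertex Jacobian
C_s = np.eye(18)               # independent endpoint errors with unit variance (1 mm)
C_v = vertex_covariance(A, B, D_j, C_s)
```

With $C_s$ equal to the identity, the result reduces to $M M^T$, so the vertex covariance depends only on the geometry encoded in $A$, $B$ and $D_j$.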

Given a model as shown in Figure 1, a sensor placement as in Figure 2, and the same
transformation as in Figure 4, the estimated uncertainty on each vertex of the object is
shown in Figure 5. In this figure, the lengths of three bars on each vertex along x, y and z
directions are given by equation (8), and show the uncertainty associated with the position
of each vertex.¹


4  Measures  for  Evaluating  Sensor  Placements 


Given  the  shape  and  pose  of  an  object  and  a  sensor  placement  of  three  light-stripe  range 
finders, we can decide whether or not the object is recognizable, and we can also estimate
the  uncertainty  in  the  object’s  pose.  In  this  section,  we  show  that  the  goodness  of  a  sensor 
placement  can  be  evaluated  through  simulation  using  measures  which  reflect  the  performance 
of  object  recognition  and  pose  determination  assuming  the  given  sensor  placement. 

4.1  Performance  Measure  in  Object  Recognition 

We  test  our  object  recognition  method  using  simulations.  Three  hypothetical  light-stripe 
range  finders  are  placed  in  the  world  coordinate  frame  as  shown  in  Figure  2.  A  polyhedral 
object  as  shown  in  Figure  1  is  then  placed  in  the  world  coordinate  frame  with  a  randomly 
generated  transformation  for  the  i  th  object  recognition  trial. 

As input data for the recognition program, a range finder simulator calculates 3-D line
segments which the three light-stripe range finders would get from viewing the object. We
obtain feasible interpretations by performing the interpretation tree search with the geometric
constraints. If all the estimated vertex positions of each feasible interpretation are near
enough to the corresponding correct positions, the interpretation is regarded as correct. The
simulation reports that 949 of 1000 trials are successful and that the average recognition
time is 0.06 seconds. All failed trials correspond to multiple interpretations which include
some correct and some incorrect interpretations.

This simulation suggests that an arbitrary sensor placement can be evaluated with many
recognition trials using a Monte Carlo method. The percentage of failed recognition trials and
the average computation time per trial indicate how good the sensor placement is for object
recognition. One problem is how many trials should be done to evaluate a sensor placement.
Simulation results of 1000, 5000 and 10000 trials under five different sensor placements
are shown in Table 1. The percentage of failed recognition trials and the recognition time
are almost the same regardless of the number of trials. Thus, 1000 trials are sufficient for
sensor placement evaluation since the improvements gained by using additional trials are not
considered crucial.


¹ For display purposes, those lengths equal $12\Delta v_{jx}$, $12\Delta v_{jy}$ and $12\Delta v_{jz}$ respectively.
Table 1: The percentage of failed trials P_fail (%) and the average computation time T (sec)
for N = 1000, 5000 and 10000 under five different sensor placements.

Sensor         N = 1000           N = 5000           N = 10000
Placement    P_fail     T       P_fail     T       P_fail     T
No. 1          5.1    0.059       4.9    0.062       4.9    0.061
No. 2         31.1    0.067      31.8    0.071      31.7    0.071
No. 3          4.9    0.087       5.1    0.072       4.8    0.070
No. 4         14.5    0.070      15.2    0.071      15.0    0.069
No. 5         11.1    0.254      10.5    0.225      10.4    0.229

4.2  Performance  Measure  in  Pose  Determination 


Our  method  can  estimate  the  position  error  of  an  object  when  the  object’s  pose  has  been 
determined. Therefore, a Monte Carlo method is used here again to estimate the average
position  error  of  the  vertices  of  an  object  under  a  sensor  placement  with  a  set  of  randomly 
generated  transformations. 

For the $i$-th transformation, a maximal position error $e_i$ over all vertices of the object is
defined as

(9)

where $C'$ is a diagonalized matrix of the covariance matrix given by equation (7) and
$n$ is the number of the vertices. The average position error $E$ for a set of transformations
$(R_i, t_i)$ for $i = 1, \ldots, N$ is obtained as

$$E = \frac{1}{N} \sum_{i=1}^{N} e_i \tag{10}$$


The probable error $\Delta E$ of the position error estimate $E$ is defined as

$$\Delta E = \frac{\sqrt{\sum_{i=1}^{N} (e_i - E)^2}}{N} \tag{11}$$

The probable error $\Delta E$ is inversely proportional to the square root of the number of trials
$N$, which is regarded as a characteristic of a Monte Carlo method.
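The $1/\sqrt{N}$ behaviour is easy to reproduce with any sampled quantity. The sketch below estimates a mean together with the usual standard error of the mean, which may differ by a constant factor from the probable error used in this paper. The name `monte_carlo_mean` and the callback `sample` are hypothetical; in the setting above, `sample` would draw a random pose and return its maximal vertex error.

```python
import math
import random


def monte_carlo_mean(sample, n_trials, seed=0):
    """Average n_trials draws of sample(rng) and estimate the error of the mean."""
    rng = random.Random(seed)
    values = [sample(rng) for _ in range(n_trials)]
    mean = sum(values) / n_trials
    # Unbiased sample variance of the individual draws.
    variance = sum((v - mean) ** 2 for v in values) / (n_trials - 1)
    # Standard error of the mean: shrinks like 1 / sqrt(n_trials).
    std_err = math.sqrt(variance / n_trials)
    return mean, std_err
```

Increasing the number of trials tenfold should therefore reduce the error estimate by a factor of about $\sqrt{10} \approx 3.2$, which is roughly the trend between the N = 1000 and N = 10000 columns of Table 2.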

Given an object as shown in Figure 1, and a set of transformations, an estimated average
position error and its probable error under the five sensor placements from Table 1 are
reported in Table 2. The results show that the average position error varies depending on the
sensor placement, and hence the value can be used as a performance measure for evaluating
a sensor placement. Judging from the ratio $\Delta E / E$, 1000 trials are sufficient to estimate an
average position error.

Table 2: The average position error E (mm) and its probable error ΔE (mm) under five
different sensor placements.

Sensor        N = 1000         N = 5000         N = 10000
Placement     E      ΔE        E      ΔE        E      ΔE
No. 1        1.61   0.024     1.63   0.009     1.63   0.007
No. 2        3.37   0.067     3.47   0.031     3.49   0.023
No. 3        2.04   0.031     2.07   0.015     2.08   0.011
No. 4        2.46   0.051     2.52   0.026     2.51   0.019
No. 5        2.30   0.036     2.30   0.017     2.30   0.012

In summary, a sensor placement can be evaluated with 1000 randomly generated transformations
in terms of the following performance measures:

• Percentage of failed recognition trials $P_{fail}$

• Average recognition time $T$

• Average position error $E$

5 Sensor Placement Design for Object Pose Determination


A sensor placement is assigned a triplet of performance measures $(P_{fail}, T, E)$ using a Monte
Carlo method. Our problem is to find a good sensor placement with which an object in
an arbitrary pose would always be recognizable with minimal computation time and with
minimal pose uncertainty. Therefore, sensor placements must be ranked on the basis of
the performance measures to select an optimal sensor placement. In this section, we define
a configuration space which represents all possible sensor placements, introduce a scalar
function to rank the sensor placements, and then design an optimal sensor placement through
simulation.
Figure 6: The definition of a light plane. The light plane is defined by three Euler angles
$(\alpha, \beta, \gamma)$ and a radius $r$ (constant).

5.1  Configuration  Space  of  Sensor  Placements 

Suppose  that  we  place  three  light-stripe  range  finders  on  the  surface  of  a  sphere  whose  center 
is  located  at  the  origin  of  the  world  coordinate  frame.  The  location  of  each  range  finder  is 
specified  by  a  light  source  position,  a  light  plane  and  a  viewpoint  which  corresponds  to  a  TV 
camera  position.  Since  there  are  many  degrees  of  freedom  to  specify  a  sensor  placement,  we 
assume  the  following  conditions  to  make  computation  tractable: 

• The range finders can be placed only on the upper hemisphere in practice.

• The radius of the sphere is constant according to the size of a workspace.

• One range finder is placed at the north pole of the sphere, directed toward the sphere
center, and its light plane is aligned with the z-x plane without loss of generality.

• The light planes of the other range finders also pass through the sphere center.

• The light source and viewpoint of each range finder are coincident.²

² A non-zero baseline complicates simulation by adding occluded line segments to the data. In simulation,
however, we can avoid this problem by assuming a zero baseline. Range is computed by intersecting the
light plane with the model.

We use three Euler angles $\alpha$, $\beta$ and $\gamma$ to represent a sensor placement $\theta$ as shown in
Figure 6. The light plane $\pi$ is given by

$$l x + m y + n z = 0 \tag{12}$$

where

$$\begin{aligned}
l &= -\cos\alpha \cos\beta \sin\gamma - \sin\alpha \cos\gamma \\
m &= -\sin\alpha \cos\beta \sin\gamma + \cos\alpha \cos\gamma \\
n &= \sin\beta \sin\gamma.
\end{aligned} \tag{13}$$

The light source position $L_s$ is $(r \cos\alpha \sin\beta, \; r \sin\alpha \sin\beta, \; r \cos\beta)$ where $r$ is the radius of the
sphere. The ranges of the Euler angles are given by $0 \le \alpha \le 2\pi$, $0 \le \beta \le \pi/2$, $0 \le \gamma \le \pi$.
Accordingly, a sensor placement $\theta$ is described with two sets of Euler angles $(\alpha_1, \beta_1, \gamma_1)$ and
$(\alpha_2, \beta_2, \gamma_2)$ corresponding to the two movable sensors. Since a sensor placement is represented
by the continuous spaces of such Euler angles, we must partition these spaces into a finite
set of sensor placements, which is called the configuration space $\Theta$.
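As a quick consistency check on the light-plane parameterization, the plane coefficients can be computed from the Euler angles and tested against the two properties the geometry requires: the coefficient vector has unit length, and the light source lies on the plane, which also passes through the sphere center. The function names below are illustrative assumptions.

```python
import math


def light_plane_normal(alpha, beta, gamma):
    """Coefficients (l, m, n) of the light plane l*x + m*y + n*z = 0."""
    l = -math.cos(alpha) * math.cos(beta) * math.sin(gamma) - math.sin(alpha) * math.cos(gamma)
    m = -math.sin(alpha) * math.cos(beta) * math.sin(gamma) + math.cos(alpha) * math.cos(gamma)
    n = math.sin(beta) * math.sin(gamma)
    return (l, m, n)


def light_source_position(alpha, beta, r):
    """Light source position on a sphere of radius r centered at the origin."""
    return (
        r * math.cos(alpha) * math.sin(beta),
        r * math.sin(alpha) * math.sin(beta),
        r * math.cos(beta),
    )


def on_plane(normal, point, tol=1e-9):
    """True when the point satisfies the plane equation within tolerance."""
    return abs(sum(a * b for a, b in zip(normal, point))) < tol
```

Here $\alpha$ and $\beta$ fix where the sensor sits on the sphere, while $\gamma$ rotates the light plane about the line joining the sensor to the sphere center.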

5.2  Ranking  Sensor  Placements 

It is not always possible to find an optimal sensor placement which has the best performance
with respect to all the measures simultaneously. Thus, we introduce a scalar function which
combines the performance measures to give a score to each sensor placement. Let $x_i$ ($i = 1, \ldots, 3$)
be the values of the triplet of a sensor placement $\theta_m$, and let $\bar{x}_i$ and $\sigma_i$ be the mean and the
standard deviation of $x_i$ over the configuration space. We define a score $S_m$ for the sensor
placement as

$$S_m = \sum_{i=1}^{3} w_i \left( \frac{\bar{x}_i - x_i}{\sigma_i} \right) \tag{14}$$

where $w_i$ is a weight. This equation expresses how far the performance measure $x_i$ of a
sensor placement deviates from the mean $\bar{x}_i$. The weight $w_i$ decides how each performance
measure contributes to the total score $S_m$.

Over all sensor placements, the maximal score is

$$S^* = \max_{\theta_m \in \Theta} S_m \tag{15}$$

Hence an optimal sensor placement is defined as a sensor placement with maximal score
$S^*$ in the configuration space. By this definition there may be more than one optimal
sensor placement due to ties. Thus, to be precise, we should refer to “an” optimal sensor
placement rather than “the” optimal sensor placement, following Goldberg’s work [8].
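A small numerical illustration of the ranking follows. It assumes that every measure is a cost, so a value below the mean raises the score, and it standardizes over only the five placements of Tables 1 and 2 (N = 1000) rather than over a full configuration space. The function name `placement_scores` is hypothetical.

```python
import statistics


def placement_scores(triplets, weights):
    """Weighted sum of standardized deviations below the mean, per placement."""
    columns = list(zip(*triplets))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]  # assumed non-zero
    return [
        sum(w * (mu - x) / sd for w, x, mu, sd in zip(weights, triplet, means, stds))
        for triplet in triplets
    ]


# (P_fail %, T sec, E mm) for placements No. 1 to No. 5, N = 1000.
triplets = [
    (5.1, 0.059, 1.61),
    (31.1, 0.067, 3.37),
    (4.9, 0.087, 2.04),
    (14.5, 0.070, 2.46),
    (11.1, 0.254, 2.30),
]
scores = placement_scores(triplets, weights=(4, 2, 4))
best_index = max(range(len(scores)), key=scores.__getitem__)
```

Because each measure is standardized before weighting, the scores over the whole set sum to zero, and the weights alone control how a gain in one measure trades against a loss in another.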

5.3  Sensor  Placement  Design 

We are now ready to design an optimal sensor placement for three light-stripe range finders.
However, exploring the entire configuration space of sensor placements is computationally
too expensive. Therefore, we introduce another Monte Carlo approach as a strategy for
selecting an optimal sensor placement. The procedure is as follows:
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Figure  7:  The  sensor  placement  with  the  highest  score  for  the  model  No.  1. 

•  Generate a set of $M$ sensor placements at random with two sets of Euler angles
$(\alpha_1, \beta_1, \gamma_1)_m$ and $(\alpha_2, \beta_2, \gamma_2)_m$ for $m = 1, \ldots, M$.

•  Estimate  the  performance  measures  of  each  sensor  placement. 

•  Combine  them  to  give  a  score  to  the  sensor  placement. 

•  Select  an  optimal  sensor  placement  which  has  a  maximal  score  among  all  the  sensor 
placements. 
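The four steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `toy_evaluate` is a hypothetical stand-in for the pose simulation that estimates the performance measures, drawing each Euler angle uniformly from [0, 2π) is an assumed sampling scheme, and the scoring rule is the weighted standardized deviation below the mean.

```python
import math
import random
import statistics


def random_placement(rng):
    """Draw two Euler-angle triples, one per movable sensor (assumed uniform)."""
    def triple():
        return tuple(rng.uniform(0.0, 2 * math.pi) for _ in range(3))
    return triple(), triple()


def toy_evaluate(placement):
    """Hypothetical stand-in for the simulation; returns (P_fail, T, E).

    The real measures come from simulating recognition over many random
    object poses. These formulas are purely illustrative.
    """
    (a1, b1, _g1), (a2, _b2, g2) = placement
    p_fail = 5.0 + 5.0 * math.cos(a1 - a2)
    t = 0.2 + 0.1 * math.sin(b1)
    e = 2.0 + 0.5 * math.cos(g2)
    return p_fail, t, e


def score_all(measures, weights=(4, 2, 4)):
    """Weighted sum of standardized deviations below the mean."""
    columns = list(zip(*measures))
    means = [statistics.mean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return [
        sum(w * (mu - x) / sd
            for w, x, mu, sd in zip(weights, row, means, stds))
        for row in measures
    ]


def select_placement(evaluate, n_placements, seed=0):
    """Generate, evaluate, score, and pick a maximal-score placement."""
    rng = random.Random(seed)
    placements = [random_placement(rng) for _ in range(n_placements)]
    measures = [evaluate(p) for p in placements]
    scores = score_all(measures)
    best = max(range(n_placements), key=lambda m: scores[m])
    return placements[best], scores[best], scores


best_placement, best_score, all_scores = select_placement(toy_evaluate, 1000)
print(best_score)
```

Since the mean and standard deviation are taken over the sampled set, scores from two different random draws are not directly comparable; the selection is only meaningful within one set of generated placements.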

5.4  Simulation  Results 

We use the object model shown in Figure 1 and select an optimal sensor placement from
1000 randomly generated sensor placements. The simulation of the $m$-th sensor placement
takes as input 1000 different poses of the object model with randomly generated
transformations, estimates the performance measures, and computes the score $S_m$. The
sensor placement with the highest score, $S^* = 13.1$, for the object model is shown in
Figure 7 (the second highest score is 12.5). The triplet for this sensor placement is
$(P_{fail}, T, E)$ = (1.6 %, 0.08 sec, 1.66 mm). Here, the weights $w_i$ are set to
(4, 2, 4). The same simulation with a different object model, No. 2, shown in Figure 8,
finds the optimal sensor placement shown in Figure 9.

The triplet values for the optimal sensor placement and the statistics of the estimated
performance measures for the two models are shown in Table 3. Object recognition for the
model No. 1 is more difficult, since the mean and median of $P_{fail}$ are much larger than
those for the model No. 2. Note that the ranking of sensor placements changes according to
the weights $w_i$. The weights $w_i$ must be set according to the requirements of the
vision task.

Table 3: The triplet values for the optimal sensor placements and the statistics of the
performance measures for the two models.

Model   Measures   P_fail (%)   T (sec)   E (mm)   S_m
No. 1   Optimal    1.6          0.08      1.66     13.1
        Mean       10.7         0.19      2.28     0.0
        Std        [missing]    0.23      0.42     [illegible]
        Median     9.7          0.09      2.22     1.6
No. 2   Optimal    0.2          0.10      1.67     10.3
        Mean       1.9          0.36      2.33     0.0
        Std        [missing]    0.23      0.48     8.3
        Median     1.0          0.30      2.23     2.0

The ranking tendency of the randomly generated sensor placements is similar for the two
models, though the optimal sensor placement differs between them. Sensor placements that
are relatively good for one model are relatively good for the other model. The
characteristics of such good sensor placements are summarized as follows:

•  Two range finders are located close together, and their light planes are almost
perpendicular.

•  The third range finder is far from the other two.

These observations can be supported not only from the point of view of geometric uncertainty
in pose determination, but also by a characteristic of our object recognition technique:
computation time for recognition with intersecting line segments is much shorter than
that without intersecting line segments [17]. Under such a sensor placement, intersecting
line segments would appear on an object face more often.

6  Experimental  Results 


Two simulation results for selecting an optimal sensor placement of three light-stripe range
finders were shown in the previous section. This section briefly presents experimental results
of recognizing an object and estimating pose uncertainty under the optimal sensor placement.
The complete experiments on pose uncertainty under the designed optimal sensor placement
are presented in [16].

Figure 10: Experimental 3-D line segments and object recognition and position error
estimation results for an arbitrary pose.

Figure 11: Simulated 3-D line segments and object recognition and position error estimation
results for the object’s pose shown in Figure 10.

Each light-stripe range finder is composed of a TV camera with a 16 mm lens and a laser
diode projector whose wavelength is 670 nm. The laser beam is spread by a cylindrical lens
to generate a light plane. The baseline length between the TV camera and the laser projector
is about 100 mm. We place three identical range finders above the workspace according to
the configuration of the designed optimal sensor placement for the model No. 1 as shown
in Figure 7. The distance between each range finder and the workspace center is about 350
mm, and each range finder’s absolute accuracy for measuring 3-D coordinates is ± 0.5 mm
within the workspace.

An object like the one depicted in Figure 1 is placed at an arbitrary pose in the workspace.
Each range finder takes two images (one with the laser diode on, one with the diode off) and
obtains 3-D line segments. Figure 10 shows the obtained 3-D line segments and the object
recognition and position error estimation results. For comparison, Figure 11 shows a
simulation result with the same object pose under the same sensor placement as the
experiment shown in Figure 10. The recognition time is 0.67 sec in the experiment, but only
0.05 sec in the simulation. In the experiment, the geometric constraints used in the
interpretation tree search were weakened to allow for measurement error, thus increasing the
number of visited nodes. We tried similar experiments with several different poses. A few
3-D line segments were occluded in some experimental results, while those line segments
appeared on object faces in the corresponding simulation results. This is because the range
finder simulator regards the light source and the viewpoint as the same point. Throughout
the trials, the experimental results are consistent with the simulation results except for
recognition time and occlusion.


7  Conclusion 


An object recognition system with simple sensors has two advantages: a simple sensor like
a light-stripe range finder is very fast, cheap, and reliable, yet provides very accurate
data; and the sensory data are sparse but have enough constraints to determine the pose of a
polyhedral object. Finding an appropriate sensor placement is a central problem for such a
multi-sensor system. Off-line batch mode planning is indispensable for many industrial
vision tasks which require speed and a low-cost system configuration.

In this paper, we have presented a method for designing an optimal sensor placement
when using three light-stripe range finders to determine the pose of a polyhedral object.
We evaluate the goodness of an arbitrary sensor placement with three performance measures:
the error rate of object recognition, recognition time, and pose uncertainty. An optimal
sensor placement is selected by ranking randomly generated sensor placements with a Monte
Carlo method. Experimental results are in agreement with simulation results. We emphasize
that the expected average performance of object recognition and pose determination
under an optimal sensor placement can be characterized completely via simulation. Our
method is applicable to object pose determination tasks as a design tool for sensor
placement.